A Fast Approach to Creative Telescoping 

Christoph Koutschan 



Abstract. In this note we reinvestigate the task of computing creative telescoping relations in 
differential-difference operator algebras. Our approach is based on an ansatz that explicitly includes 
the denominators of the delta parts. We contribute several ideas of how to make an implementation 
of this approach reasonably fast and provide such an implementation. A selection of examples 
shows that it can be superior to existing methods by a large factor. 
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1. Introduction 

The method of creative telescoping nowadays is one of the central tools in computer algebra for 
attacking definite integration and summation problems. Zeilberger with his celebrated holonomic 
systems approach ifTTll was the first to recognize its potential for making these tasks algorithmic for 
a large class of functions. In the realm of holonomic functions, several algorithms for computing 
creative telescoping relations have been developed in the past. The methodology described here is 
not an algorithm in the strict sense because it involves some heuristics. But since it works pretty 
well on nontrivial examples we found it worth to be written down. Additionally we believe that 
it is the method of choice for really big examples. Our implementation is contained in the Mathe- 
maticapackage HolonomicFunctions as the command FindCreativeTelescoping. The 
package can be downloaded from the RISC combinatorics software webpage: 



http://www.risc.uni-linz.ac.at/research/combinat/software/ 



Throughout this paper we will work in the following setting. We assume that a function / to 
be integrated or summed satisfies some linear difference-differential relations which we represent in 
a suitable operator algebra (Ore algebra). We use the symbol D x to denote the derivation operator 
w.r.t. x and Sn for the shift operator w.r.t. n. Such an algebra can be viewed as a polynomial ring 
in the respective operators, with coefficients being rational functions in the corresponding variables, 
subject to the commutation rules D x x — xD x + 1 and SnTi = nSn + Sn. Ideally, all the relations for / 
generate a 9-finite left ideal, i.e., a zero-dimensional left ideal in the operator algebra. If addition- 
ally / is holonomic (a notion that can be made formal by D-module theory), then the existence of 
creative telescoping relations is guaranteed by theory (i.e., by the elimination property of holonomic 
modules). Chyzak, Kauers, and Salvy have shown that creative telescoping is also possible for 
higher-dimensional ideals under certain conditions. We tacitly assume that any input to a creative 
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telescoping algorithm is 3-finite and holonomic, and that it is given as a left Grobner basis G of the 
annihilating ideal of /. 

The main concern of this paper is to compute creative telescoping relations for an integrand 
(resp. summand) f(v, w) where v is the integration (resp. summation) variable and w = Wi, W2, • • • 
are some additional parameters. In other words, we are looking for annihilating operators for f(v, w) 
of the form 

P(w,d w ) + d v ■ Q(v, d v ,w,d w ) (1.1) 

where d v and d v stand for operators acting on the variable v (d v = d v — D v in the case of an integral, 
and d v = Su and d v = S v — 1 in the summation case), and d w are the operators d wi , d W2 , . . . that 
correspond to the extra parameters. We will refer to P as the principal part (also known as the 
telescoper), and to Q as the delta part. From ( 11.11 ) it is immediate to derive relations for the definite 
integral (resp. sum), which in general can be inhomogeneous. Similarly, multiple integrals and sums 
can be done by creative telescoping relations that correspond to an operator of the form 

P(w, d w ) + d Vl ■ Q 1 (v,w,d v ,d w ) + d V2 ■ Q 2 (v,w,d v ,d w ) + ... (1.2) 

where v = v\ , V2, ■ ■ ■ denote the integration (resp. summation) variables (also mixed cases are 
possible); we will use the notation v a — v^v^ 2 • ■ • . Note that in general the application of ( 11.2b 
yields a relation with inhomogeneous right-hand side that consists of integrals (resp. sums) of one 
dimension lower. These can be treated recursively in the same way. 



2. Description of the method 

The known algorithms for computing creative telescoping relations for holonomic functions are 
either based on elimination (e.g., by rewrite rules / Grobner bases) or on the use of an ansatz with 
undetermined coefficients. Zeilberger's slow algorithm [17| and Takayama's algorithm lfl4l [131 171 
fall into the first category. Their advantage is that they can deal with multiple integrals and sums, 
but elimination can be a very difficult task and moreover, it is not guaranteed that they deliver the 
smallest relations that exist in the given annihilating ideal. A relatively small (and hypergeometric!) 
example that so far has resisted the elimination approach is given in Section l3~3l Since the algorithms 
that we are going to discuss here fall into the second category, we do not want to go into further detail 
with elimination-based algorithms. 

All the other algorithms make an ansatz with undetermined coefficients. Reduction modulo the 
Grobner basis G gives its normal form representation. For the creative telescoping relation to be in 
the ideal generated by G, its normal form must be identically zero. Hence equating all coefficients of 
the normal form to zero gives rise to a system of equations that can be solved for the undetermined 
coefficients. The algorithms described below differ only in the shape of the ansatz. 

A classical algorithm that is based on an ansatz has been proposed by Chyzak [5 |. It can only 
be applied to single integrals or single sums, and has to be used in an iterative way for multiple ones. 
With the notation of (II . lb its ansatz is of the following form: 

^2pp(w)d& + d v ■ ^2 q 1 (v,w)(d v ,d w )~ < (2.1) 

/3GB ~l£U 

where B is a finite multi-index set and U is the finite set of multi-indices that correspond to the 
monomials under the stairs of the Grobner basis G. The unknown pp{w) are rational functions 
in K(w), and the unknown q-y(v, w) are rational functions in K(v,w). When the ansatz d2.U is 
written in standard operator representation, i.e., when d v is commuted to the right, we encounter 
derivatives (resp. shifts) of the q-y(v, w) with respect to v. Finally, we end up with a coupled linear 
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first-order system of differential (resp. difference) equations. All implementations of Chyzak's algo- 
rithm that we know o{] uncouple this system and then solve the resulting scalar equations one by 
one. Experience shows that these steps can be extremely costly, in particular when U is big. As an 
alternative to uncoupling there are algorithms for directly solving such coupled systems, proposed 
by Abramov and Barkatou flUBl. A comparison of how these methods perform on big creative tele- 
scoping examples could be an interesting research topic for the future. However, in this article we 
want to follow a different way of bypassing the bottleneck, namely by means of a different ansatz 
that does not lead to a coupled system. 

In ifTOlfTTl a first step into this direction has been taken by means of a "polynomial ansatz" of 
the form 

J2Pl3(w)dg + d Vl ■ J2 E qx,c t ,-r(w)v a (d v ,d w y+ ... (2.2) 

where Ai, B, and Cj are finite sets of multi-indices, and the dots hide terms with d V2 , . . . . The 
unknown pp and qi t(xn to solve for are rational functions in the surviving variables w and they 
can be computed using pure linear algebra, without any uncoupling needed (since commuting the 
d Vi to the right will not affect them). Note also that the summation variables will not occur in the 
denominators. The price that we pay is that the shape of the ansatz is not at all clear from the 
beginning: The sets Ai, B, and Cj need to be fixed, whereas in Chyzak's algorithm we have to 
loop only over the support of the principal part (the set B). Another drawback of this ansatz is the 
fact that the delta parts in this denominator-free representation usually have not only much bigger 
supports (|Cj| > \U\), but also higher polynomial degrees and larger integer coefficients than the 
reduced representation returned by Chyzak's algorithrrQ Of course, once found, the solution of the 
polynomial ansatz can be transformed to reduced representation yielding the same result as Chyzak's 
algorithm in the one-dimensional case (just reduce the delta part with the Grobner basis G). So we 
have now got rid of the uncoupling problem, but the above observation suggests that the result of the 
polynomial ansatz is blown up unnecessarily and that we can do even better. 

And in fact, we can! The only thing we have to do is to include the denominators of the delta 
parts into the ansatz: 

+ ^-E E q T(vw) i9M " + •■• (23) 



where the notation is as in (12. 2t except that we can restrict the support of the delta part to the mono- 
mials under the stairs of G (as in Chyzak's algorithm). The denominators di^ are polynomials in 
K[v, w]. After coefficient comparison with respect to v we finally have to solve a linear system in the 
pp and the qi tOL ,-y over K(to). This means that when we will be talking about the denominators 
we will refer solely to those parts (factors) of the d% n that actually involve some of the variables v; 
the remaining factors will be contributed from the solutions over K(io). 

Now it is no secret that the denominators di n can be somehow predicted: We do not know 
them a priori, but we can deduce a list of candidate factors that might appear within them. In the 
hypergeometric case, Wilf and Zeilberger lfl6l already have described how to get a good guess on 
the denominators. In general it is not difficult to see that the leading coefficients of the Grobner 
basis G play the key role: consider a solution of the form ( 12.21 ) (that is guaranteed to exist by the 
elimination property, provided that / is holonomic) and reduce its delta part with G to obtain a 
representation of the form ( 12.31 ). It is now clear that a solution must exist where only factors from 
the leading coefficients of G as well as their shifted instances appear in the denominators. 



1 Besides our Mathematica package HolonomicFunctions, this refers to Chyzak's implementation which is part of the 
Maple package Mgf un I http://algo.inria.fr/libraries/ 1. 

2 To give some quantitative results: in the TSPP example, see Sectionf4] we ended up with polynomial degrees up to 764 and 
integer coefficients with up to 378 digits, whereas the reduced representation has degree 120 and at most 74-digit integers. 
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The rest of this section will be dedicated to explaining how this technique can be completely 
automatized and made reasonably fast. For example, the implementations of the hypergeometric case 
that use ansatz i2.3\ (by Zeilbergejf] in Maple, and by Wegschaidei@ in Mathematica) require the 
denominators explicitely as input. Moreover, Wegschaider in his thesis [ 15] states that ansatz ( 12.2b is 
preferable to ( 12.31 even when the denominators are known. We do not share his opinion, as we shall 
demonstrate below. Or more concretely, we do not think that this statement holds in the general (not 
necessarily hypergeometric) case. 

In all ansatz-based algorithms (and hence in our approach), the main loop is over the support 
of the principal part. In the following we concentrate on one single step, i.e., we assume that the 
set B is fixed, and describe our implementation by explaining the various optimizations that we have 
included. 

Optimization 1. To get started with, we need some heuristics for guessing the denominators di n . 
It seems to be natural to start with a common denominator d, i.e., di n \ d for all i and all 7. 
Ideally, a good heuristic should always deliver a d that is indeed a multiple of all the denominators, 
but without overshooting too much. Experiments suggest that it suffices to take the denominators 
that occur during the reduction of the whole ansatz; recall that its support is already fixed which 
allows for a very fast "simulated reduction" where we do not compute with coefficients but only 
with supports. 

Optimization 2. Once we have a good candidate for the common denominator we have to test 
whether for this setting, i.e., for the principal part under consideration, there exists a solution. Ad- 
ditionally we still have to fix the degrees of the ansatz (the sets Ai); see the next step for this issue. 
To see whether there is a solution we have to reduce the ansatz with the Grobner basis G and solve 
the corresponding linear system. And of course, it suffices to perform these steps in a homomorphic 
image. Hence we plug in some concrete integers for the parameters w and reduce all integer coef- 
ficients modulo a prim^E If there is no solution in the homomorphic setting, there is, a fortiori, no 
solution in the general setting. By choosing these values sufficiently generically we can also mini- 
mize the risk of obtaining a homomorphic solution that does not extend to a general solution. There 
is one important point to mention and this concerns the reduction modulo G: we are working in a 
noncommutative algebra and therefore it is problematic to replace an indeterminate by an integer, 
since by doing so, we loose the noncommutativity between this indeterminate and the operator that 
is connected to it. In 1111 we have described a modular reduction procedure that keeps track of this 
issue. The basic idea is that in each step of the reduction process we have to do the noncommutative 
multiplication of a Grobner basis element that makes the leading power products match, in the gen- 
eral setting, but everything else in the homomorphic setting since no noncommutativity is involved 
any more. 

Optimization 3. We still do not know which ■u-degrees in the numerators of our ansatz we should 
try. It is manifest to start with small degrees and increase them until either the homomorphic com- 
putations indicate that a solution might exist or the degrees become unreasonably large (hence here 
is a second heuristic involved). The following observation suggests that we can be quite generous 
with setting an upper bound for the degrees. Let T be the ansatz ( 12.3b and T' its counterpart with the 
increased degrees (for this reasoning it is irrelevant whether we talk about the total degree in v or 
the componentwise degrees; in any case we have some A! i D Ai, i — 1, 2, . . . ). Then the unknowns 
that appear in T' — T are precisely the qi, ai ,-y with a.i E A[\ Ai. Obviously the sets of undeter- 
mined coefficients occurring in T and T' — T have empty intersection. Hence we do not have to 
build the whole linear system from scratch, but instead in each step of the degree-looping we have 
to reduce only the new part T" — T and add some columns (and possibly rows) to the matrix. In our 

a http://www.math. rutgers.edu/~zeilberg/programs. html 
4 http://www.risc.uni-linz.ac.at/research/combinat/software/MultiSum/ 

5 in our implementation, we choose 7-digit integers for the w and as modulus the largest prime that fits into a machine word, 
i.e., 2147483629. 
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implementation we have decided to loop over an integer S such that 5 limits the degree of each of the 
variables v. Once a homomorphic solution is found for a certain 5 we can refine the degree setting 
componentwise by a few more modular nullspace computations. 

Optimization 4. Ok, let's now assume that we found a homomorphic solution for some principal 
part, some common denominator d and some degree setting A,*. We could now use this ansatz to 
start the final computation, but there is still lots of possibilities for improvement. Recall that at this 
point we only have some common multiple of the denominators, but not necessarily the minimal 
one. Again using homomorphic computations, it is not difficult and not very costly to figure out the 
least common multiple of all the denominators: for each factor of d we check whether deleting this 
factor and decreasing the degree setting of the numerators accordingly, still yields a solution. If so, 
we can remove this factor from our ansatz. Note that such unnecessary factors in the denominator 
blow up the degrees of the numerator, too, since they have to cancel in the end. 
Optimization 5. But should we stop after minimizing the common denominator? In the very same 
fashion, we can now proceed to minimize each single denominator d^. Note that for small exam- 
ples this overhead can consume a considerable part of the total computation time. However, we are 
definitely convinced that it pays off in big examples. 

Optimization 6. Last but not least we omit all undetermined coefficients from our ansatz that are 
zero in the homomorphic image. They most probably will be zero in the end — but this is folklore. . . 



3. Examples 

In this section we want to present some examples that illustrate the applicability of our ansatz. To 
have a fair comparison of the timings, we do all computations in the same computer algebra system 
(Mathematica) and we want to mention that all the code has been implemented by the same person 
(unless stated otherwise). The results are listed in Table [T] 





Ansatz (2.1) 


Ansatz (2.3) 


Example 


time 


memory 


output 


time 


memory 


output 


Bessel 


127 


78 


7.2 


10 


1.9 


7.2 


Gegenbauer 


1122 


601 


13 


1.6 


0.66 


13 


Andrews-Paule 


27 


30 


341 


2.1 


0.35 


4.3 


Feynman (wz) 
Feynman (zw) 


81 
171 


156 
126 


293 
91 


174 


85 


156 



Table 1 . Timings (in seconds), memory usage (in MegaBytes), and output size 
(in KiloBytes) for Chyzak's algorithm and our ansatz; the Feynman example con- 
sists of two rows corresponding to the different orders of integration (in the double 
sum, changing the summation order does not lead to different values). 



3.1. Integral with four Bessel functions 

An example that is often used for testing creative telescoping procedures is the following integral 
over a product of four Bessel functions 

/ xJ l {ax)h(ax)Y (x)K (x)dx = - 1 ° S f > 1 ~ ° \ (3.1) 
Jo 2vra 2 
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The intriguing fact with this example is that the input, the annihilating ideal for the integrand 

{a 3 D* + 4a 2 D 3 - 3aD 2 + 3D a + Aa 3 x i , 
x A D* - 4ax 3 D 3 D a + 6a 2 x 2 D 2 D 2 - 4a 3 xD x D 3 + 12ax 2 D 2 D a - 24a 2 xD x D 2 + 
8a 3 D 3 + x 2 D 2 - 26axD x D a + A0a 2 D 2 - 3xD x + 2&aD a - 4a 4 ir 4 + Ax A + 3} 

as well as the output, the creative telescoping operator 

aD a + 2 + 

D x ■ 4(q4 I 1)x3 ■ (~ax 3 D 3 D a + 4a 2 x 2 D 2 D 2 - 6a 3 xD x D 3 - 2x 3 D 3 ^ ^ 

+l2ax 2 D 2 D a - 32a 2 xD x D 2 + 16a 3 D 3 - 25axD x D a 
+70a 2 D 2 - 2xD x + 19aD a - 16a 4 a; 4 + 2) 

are pretty small. In particular, observe that the principal part is of order 1, and hence no longish 
looping is necessary. But the integrand, being a product of four Bessel functions (which are 9-finite 
with dimension 2), has an annihilating ideal that contains 16 = 2 • 2 • 2 • 2 monomials under its 
stairs, and this causes Chyzak's algorithm to take quite long with finding ( 13.21 i (recall that a 16 by 16 
system of differential equations has to be uncoupled which causes intermediate expression swell). 
Therefore the timings and memory usages differ by more than one order of magnitude: 10s vs. 127s, 
and 1.9MB vs. 78MB, respectively. 

3.2. A product of three Gegenbauer polynomials 

The following identity can be found as formula (6.8.10) in the book by Andrews, Askey, and Roy J2). 
It is valid when A > — ^ and A ^ 0, 1 + m + n is even and the sum of any two of I, m, n is not less 
than the third. The integral is zero in all other cases: 

J -i T{\y + m + n) + A) 

W(m+n-l)/2W(l+n-m)/2W(l+m-n)/2 

X (i(m + n - l))l (±(Z +n-m))\ +m- n))!(A) (i+m+n)/2 

One creative telescoping operator that can be used to prove this identity has the innocent-looking 
principal part 

(l + m-n+ + 2\-m + n- l)S, n - (I -m + n + + 2A + m - n - l)Sn. 

We fix the support {Sm, Sn} and compare the runtime of the different ansatze. With our implemen- 
tation of Chyzak's algorithm we need 1 122 seconds to get the above, whereas our new approach can 
do it in 1.6 seconds! Again the reason is that the vector space under the stairs of the Grobner basis 
has dimension 8 which is already relatively large and leads to huge intermediate expressions (see 
TableQ). 

3.3. A hypergeometric double sum 

The following double sum has been studied by Andrews and Paule [ 3 1 from a human and a computer 
algebra point of view: 

1 3 

This sum being hypergeometric can be treated with both Zeilberger's Multi-WZ implementation in 
Maple and Wegschaider's MultiSum package in Mathematica (see the footnotes on page|4]i. We 
take the latter for a comparison. As already mentioned above, we have to give the correct denomina- 
tors of the delta parts as input. Then the command FindRationalCertif icate takes 0.5s to 
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find the creative telescoping operator 

i (2ij ~in + i + 2j 2 - 3jn + 2j - 3n) 



1 - (S - 1) 
(Sj 1) 



CJ + !)(* + J ~2n) 
j (2i 2 + 2ij — 3m + 2i — jn + j ' — 3n) 



(i + l)(i+j-2n) 

Our implementation, having to figure out the denominators on its own, takes a little bit longer, 
namely 2.1s. We believe that this is still reasonably fast and bet that every user would need more 
than 1 .6 seconds even for only typing the denominators ! Using Wegschaider's implementation of the 
ansatz ( 12.21 (his favourite method) takes 3.2s and the output is not as nice as the one given above 
(memory usage: 9MB, output size: 200KB). 

We have mentioned that Chyzak's algorithm in principle can be iteratively applied to solve 
multi-summation problems. This example spectacularly demonstrates that this strategy can end up 
in a long, stony way: The creative telescoping relations for the inner sum are huge and the tricky 
boundary conditions make things even more difficult. 

3.4. Feynman integrals 

Another interesting area of application is the computation of Feynman integrals that is a hot topic in 
particle physics. We borrow a relatively simple example from the thesis [9, (J.17)]: 

/ / —, M (I - w n+1 - (I - w) n+1 ) dwdz. 

Jo Jo (z + w-wz) 1 ^ v \ ) J 

The integrand is not hyperexponential and hence the multivariate Almkvist-Zeilberger algorithm is 
not applicable (at least not without reformulating the problem). Our implementation needs 174s to 
find the third-order recurrence in n (but can be reduced to 24s by use of options). Again we can 
try Chyzak's algorithm iteratively on this example, running into similar troubles as in the previous 
example, although the swell of the intermediate expressions is by far not as bad as before (note 
that the input Grobner basis has only 3 monomials under the stairs). Depending on the order of 
integration this takes 81s or 171s, but on the cost of higher memory consumption (see TableQ]). 



4. Conclusion and Outlook 

We have described an approach to finding creative telescoping relations that is particularly inter- 
esting for multiple sums and integrals as well as for big inputs. For small examples the necessary 
preprocessing might well consume most of the computation time, but the bigger the input is the less 
this carries weight. By avoiding the expensive uncoupling step, our ansatz becomes more attractive 
as the size of the set U (monomials under the stairs) grows. From a theoretical point of view Chyzak's 
algorithm is still preferable since it is guaranteed to find the smallest creative telescoping relations 
(with respect to the support of the principal part) whereas our approach involves some heuristics 
which in unlucky cases can prevent us from getting the minimal output. Therefore we have incor- 
porated several options into our implementation (e.g., for fixing the support of the principal part, 
the common denominator in the delta part or its numerator degrees) that allow the user to override 
the built-in heuristics. In the rare cases where this leads to different results, we can compute their 
union (corresponding to the right gcd in the univariate case) and with high probability end up with 
the minimal telescoper. 

We want to conclude with a really big example to which we plan to apply our method in 
the near future. In ifTTI we have presented a computer proof of Stembridge's TSPP theorem using 
the polynomial ansatz ( 12.21 ). The computations for achieving this took several weeks! Attacking the 
notorious g-TSPP conjecture, which is the g-analogue of Stembridge's theorem, therefore seemed to 
be hopeless due to the additional indeterminate q that blows up all computations by some orders of 
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magnitude. With our fast implementation of ansatz (12.3b we can now find the creative telescoping 
relations for the ordinary TSPP in a few hours! This fact, together with the previous work done in j8) 
makes it likely that the g-TSPP conjecture can be turned into a theorem after being open for about 
25 years. 

Addendum to the final version. Meanwhile we have succeeded in proving the long-standing q- 
TSPP conjecture using exactly the methods described in this paper. The corresponding article has 
already been submitted for publication fl2l . 
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